Duration Prediction in Mandarin TTS System
نویسندگان
چکیده
This paper reports the methodology and results of decision tree based duration prediction for a Mandarin text-to-speech system developed by the Fujitsu Laboratories. Syllable initials and finals are the basic units in this duration study. Factors influencing finals duration such as phrase boundary and phone context are discussed in detail. Experiments indicate that it is the most important determinant of finals duration whether the prosodic factor of the right phrase boundary level is below the prosodic word level or not. Furthermore, the degree of phrase boundary vowel lengthening may vary depending on the types of finals. This paper also explains methods for objective evaluation of duration prediction model. Lastly, prosody evaluation results convincing that the prosody generated by our prosody generation module is much better than that of two other popular Mandarin TTS systems.
منابع مشابه
Decision Tree based Duration Prediction in Mandarin TTS System
This paper reports the methodology and results of decision tree based duration prediction for a Mandarin text-to-speech system developed by the Fujitsu Laboratories. Syllable initials and finals are the basic units in this duration study. Factors influencing finals duration such as phrase boundary and phone context are discussed in detail. Experiments indicate that it is the most important dete...
متن کاملDuration modeling and memory optimization in a Mandarin TTS system
Current speech synthesis efforts, both in research and in applications, are dominated by methods based on concatenation of spoken units. New progress in the concatenative text-to-speech (TTS) technology can be made mainly from two directions, either by reducing the memory footprint to integrate the system into embedded system, or by improving the synthesized speech quality in terms of intelligi...
متن کاملPitch Prediction for Mandarin TTS with Mutual Prosodic Constraint
Most of current pitch prediction methods for mandarin TTS try to get pitch contours from the contextual information with a group of weights assigning. Without a good method in prosody concatenation constraint, the predicted pitch contours are not always stable because of the incomplete accordance between prosody information and text information. The paper presents a new mandarin pitch predictio...
متن کاملSyllable HMM based Mandarin TTS and comparison with concatenative TTS
This paper introduces a Syllable HMM based Mandarin TTS system. 10-state left-to-right HMMs are used to model each syllable. We leverage the corpus and the front end of a concatenative TTS system to build the Syllable HMM based TTS system. Furthermore, we utilize the unique consonant/vowel structure of Mandarin syllable to improve the voiced/unvoiced decision of HMM states. Evaluation results s...
متن کاملVariable Speech Rate Mandarin Chinese Text-to-Speech System
This paper presents an Hidden Markov Model (HMM)-based variable speech rate Mandarin Chinese text-to-speech (TTS) system. In this system, parameters of spectrum, fundametal frequency and state duration are generated by a context dependent HMM (CDHMM) whose model parameters are linear-interpolated from those of three CDHMMs trained by corpora in three different speech rates (SRs), i.e. fast, med...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006